Correlation and Prediction of Evaluation Metrics in Information Retrieval
نویسندگان
چکیده
Because researchers typically do not have the time or space to present more than a few evaluation metrics in any published study, it can be difficult to assess relative effectiveness of prior methods for unreported metrics when baselining a new method or conducting a systematic meta-review. While sharing of study data would help alleviate this, recent attempts to encourage consistent sharing have been largely unsuccessful. Instead, we propose to enable relative comparisons with prior work across arbitrary metrics by predicting unreported metrics given one or more reported metrics. In addition, we further investigate prediction of high-cost evaluation measures using low-cost measures as a potential strategy for reducing evaluation cost. We begin by assessing the correlation between 23 IR metrics using 8 TREC test collections. Measuring prediction error wrt. R and Kendall’s τ , we show that accurate prediction of MAP, P@10, and RBP can be achieved using only 2-3 other metrics. With regard to lowering evaluation cost, we show that RBP(p=0.95) can be predicted with high accuracy using measures with only evaluation depth of 30. Taken together, our findings provide a valuable proof-of-concept which we expect to spur follow-on work by others in proposing more sophisticated models for metric prediction.
منابع مشابه
Review of ranked-based and unranked-based metrics for determining the effectiveness of search engines
Purpose: Traditionally, there have many metrics for evaluating the search engine, nevertheless various researchers’ proposed new metrics in recent years. Aware of this new metrics is essential to conduct research on evaluation of the search engine field. So, the purpose of this study was to provide an analysis of important and new metrics for evaluating the search engines. Methodology: This is ...
متن کاملEvaluation of Classifiers in Software Fault-Proneness Prediction
Reliability of software counts on its fault-prone modules. This means that the less software consists of fault-prone units the more we may trust it. Therefore, if we are able to predict the number of fault-prone modules of software, it will be possible to judge the software reliability. In predicting software fault-prone modules, one of the contributing features is software metric by which one ...
متن کاملThe Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...
متن کاملA Plan for Making Information Retrieval Evaluation Synonymous with Human Performance Prediction
Today human performance on search tasks and information retrieval evaluation metrics are loosely coupled. Instead, information retrieval evaluation should be a direct prediction of human performance rather than a related measurement of ranked list quality. We propose a TREC track or other group effort that will collect a large amount of human usage data on search tasks and then measure particip...
متن کاملPerformance Evaluation of Medical Image Retrieval Systems Based on a Systematic Review of the Current Literature
Background and Aim: Image, as a kind of information vehicle which can convey a large volume of information, is important especially in medicine field. Existence of different attributes of image features and various search algorithms in medical image retrieval systems and lack of an authority to evaluate the quality of retrieval systems, make a systematic review in medical image retrieval system...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1802.00323 شماره
صفحات -
تاریخ انتشار 2018